Mining NANOG Mailing List
Abstract
It has been shown that misbehavior by a few malicious, compromised, or misconfigured BGP routers can lead to serious outages in the Internet. This weakness becomes increasingly critical with the recent growth of outage-sensitive applications such as Voice over IP, streaming media, and video conferencing. To address such misbehavior, previous work has mainly focused on detecting or preventing outages in a distributed manner using limited Internet state. In this paper, we present a first step towards efficient troubleshooting by mining network operators' mailing lists. Using Natural Language Processing (NLP) and Machine Learning, we develop a new approach to extract useful information from the mailing list of the North American Network Operators' Group (NANOG). Our experimental results show that the proposed approach detects 94 out of 105 outages from NANOG with a false positive rate of only 7.3%. We validate the extracted outages using real network logs collected by the Route Views project. While our approach is not perfectly accurate, we envision it as a useful source of information for existing anomaly detection and prevention mechanisms.
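As a quick check on the reported figures, the detection rate implied by 94 of 105 outages can be computed directly. This is a minimal sketch: the function and variable names are illustrative and not from the paper, and the 7.3% false positive rate is taken as reported because the abstract does not give the underlying counts.

```python
def detection_rate(detected: int, total: int) -> float:
    """Fraction of known outages that were detected (i.e. recall)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return detected / total


# Figures as reported in the abstract.
reported_detected = 94
reported_total = 105

rate = detection_rate(reported_detected, reported_total)
missed = reported_total - reported_detected

print(f"detection rate: {rate:.1%}, missed outages: {missed}")
# prints "detection rate: 89.5%, missed outages: 11"
```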
Similar resources
Multi-Data Mining for Understanding Leadership Behavior
We propose an approach for understanding leadership behavior in dot-jp, a non-profit organization, by analyzing heterogeneous multi-data composed of questionnaires and mailing list archives. Attitudes toward leaders were obtained from the questionnaires, and human networks were extracted from the mailing list archives. By integrating the results, we discovered that leaders must receive messages...
Authorship Identification for Heterogeneous Documents
The study of authorship identification in Japanese has for the most part been restricted to literary texts using basic statistical methods. In the present study, authors of mailing list messages are identified using a machine learning technique (Support Vector Machines). In addition, the classifier trained on the mailing list data is applied to identify the author of Web documents in order to i...
A Tool for Identifying Swarm Intelligence on a Free/open Source Software Mailing List
A software tool designed using the concepts of swarm intelligence and text mining is proposed as an aid in the analysis of free/open source software (FOSS) development communities. A prototype of the tool collects textual data from an electronic mailing list, a primary mode of FOSS developer communication. The tool enables a user to compare patterns of discussion topics found in the text with p...
Internet Outages, the Eyewitness Accounts: Analysis of the Outages Mailing List
Understanding network reliability and outages is critical to the “health” of the Internet infrastructure. Unfortunately, our ability to analyze Internet outages has been hampered by the lack of access to public information from key players. In this paper, we leverage a somewhat unconventional dataset to analyze Internet reliability—the outages mailing list. The mailing list is an avenue for net...
Exploring the Music Library Association Mailing List: A Text Mining Approach
Music librarians and people pursuing music librarianship have exchanged emails via the Music Library Association Mailing List (MLA-L) for decades. The list archive is an invaluable resource to discover new insights on music information retrieval from the perspective of the music librarian community. This study analyzes a corpus of 53,648 emails posted on MLA-L from 2000 to 2016 by using text mi...